Synthesis and screening of ligands using X-ray crystallography

ABSTRACT

A method for identifying a ligand of a target macromolecule is disclosed, comprising the steps of: soaking one or more crystals of the target macromolecule in a solution containing a collection of compounds generated in situ or separate from the crystal, where the solution has been prepared without the purification of the synthesized collection of compounds; obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.

This application is the U.S. national phase of international application PCT/GB2003/005595 filed 22 Dec. 2003 which designated the U.S. and claims benefit of GB 0230172.9, filed 24 Dec. 2002, and claims priority to U.S. Provisional Application No. 60/435,349 filed 23 Dec. 2002, the entire contents of each of which are hereby incorporated by reference.

This invention relates to methods for synthesising compounds and identifying from those compounds ligands that bind target macromolecules using X-ray crystallography.

BACKGROUND

Determining the structure of proteins by X-ray crystallography is an elegant and reliable method and is the basis of structure-based ligand design in which small molecules are synthesized as potential ligands for the protein of interest. This is an intense area of research for the optimisation of ligands to drugs for therapeutically interesting proteins (see Babine, R. E. and Bender, S. L., Chemical Reviews, 97, 1359–1472 (1997) and Bohacek, R. S., et al., Med. Res. Rev., 16, 3–50 (1996)).

One method of ligand screening is described in WO 99/45379, in which a library of shape-diverse compounds thought to be potential ligands are soaked or co-crystallised with a target protein, and then the resulting complex is analysed by X-ray crystallography to determine the nature of the ligand which has bound. The library of compounds which is used in the screening process generally comprises previously characterised compounds.

SUMMARY OF THE INVENTION

The above described method requires every potential ligand to be synthesized, purified and characterized before it can form part of a library for screening. If the potential ligands are simply purchased from commercial sources, their costs will typically be high due to the amount of work required to carry out these three steps. If, in the alternative, the compounds are to be synthesized ‘in-house’, then a great deal of time and effort will need to be expended on assembling the library of ligands for screening.

The present inventors have developed a method where a collection of compounds is synthesized and then screened without the need for any purification and/or characterisation steps.

Accordingly, the present invention provides a method for identifying a ligand of a target macromolecule comprising the steps of:

-   a) soaking one or more crystals of the target macromolecule in a solution containing a collection of compounds generated in situ or separate from the crystal, where the solution has been prepared without the purification, and preferably without the characterisation, of the synthesized collection of compounds;
-   b) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and
-   c) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.

Any solvated crystal system in which the solvent and/or ligand molecules are able to infiltrate throughout the crystal via diffusion, and where the crystal system is compatible with X-ray diffraction data collection, is suitable for use in the invention.

Examples of appropriate macromolecules are polypeptides (proteins), ribose nucleic acids (RNAs, ribozymes etc), deoxy ribose nucleic acids (DNAs), and complexes of combinations of the three examples, e.g. ribosomes, or viruses (DNA and/or RNA-protein complexes).

A ligand is a molecule which can bind to a macromolecule. For a polypeptide chain (protein), this is anything that is not coded for by the DNA sequence of the protein. This covers the post-translational modification of proteins (e.g. covalent attachment of sugars, etc.), the covalent and non-covalent attachment of cofactors (e.g. Haem groups), the binding of other polypeptides or amino acids, the binding of small molecules (e.g. drugs, substrates, etc.) and the binding of DNA and RNA to proteins. For nucleic acids, a ligand is any molecule either covalently or non-covalently bound to DNA or RNA (e.g. ligands intercalated between bases).

In the present invention, the compounds can be synthesised by parallel synthesis, or (more conveniently) by combinatorial chemistry. Traditionally, combinatorial chemistry is used to generate small molecule inhibitors for screening against one or more biological targets. The synthesis of libraries of compounds has generally been aimed at producing either compounds as purified single entities or as high-quality mixtures of compounds using methodology that allows for deconvolution of the mixture, once it has been determined that an active compound is to be found in that mixture. Deconvolution requires the re-testing of each member compound of the active library. Thus either method requires the careful analysis and characterisation of the libraries to allow for interpretation of the data generated during biological screening against the target macromolecule. As mentioned above, the present invention does not require the purification and/or characterisation of the members of the synthesised library, as the identity of the ligand is determined by X-ray crystallography of the ligand-macromolecule complex.

The solution containing the collection of compounds can be prepared in two main ways, herein called ‘in-situ’ synthesis and ‘just-in-time’ synthesis.

‘In-situ’ synthesis involves synthesizing the collection of compounds in a solution which also contains the one or more crystals of the target macromolecule, and therefore requires the use of chemistries which can be carried out under conditions in which the macromolecule crystal will remain stable.

Accordingly, a first aspect of the present invention provides a method for identifying a ligand of a target macromolecule comprising the steps of:

-   a) synthesizing a collection of compounds, which are suitable for screening against a target macromolecule, in a solution containing one or more crystals of the target macromolecule;
-   b) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and
-   c) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.

In this method, the synthesis of the collection of compounds will take place in a single reaction vessel, i.e. the vessel in which the solution containing one or more crystals of the target macromolecule is present.

‘Just-in-time’ synthesis involves synthesizing the collection of compounds remotely from the solution which contains the one or more crystals of the target protein, and then transferring the synthesized collection into the solution containing the one or more crystals of the target protein. No purification and/or characterisation of the synthesized collection is carried out. The synthesis may take place in a solvent which is not compatible with the macromolecule crystals, from which the collection of compounds must be separated in order to add them to the solution containing the macromolecule crystals.

Accordingly, a second aspect of the present invention provides a method for identifying a ligand of a target macromolecule comprising the steps of:

-   a) synthesizing a collection of unpurified compounds suitable for screening against a target macromolecule;
-   b) adding the collection of compounds to a solution containing one or more crystals of the target macromolecule;
-   c) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and
-   d) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.

In this method, the synthesis of the collection of unpurified compounds may occur in one or more reaction vessels.

If step a) takes place in a solvent which is not compatible with the macromolecule crystals then after step a) the collection of compounds is separated from the solvent in which the compounds were synthesised.

Typically, such non-compatible solvents are organic and the step of separating the collection of compounds from this solvent is usually carried out by evaporating the solvent. This is then followed by re-dissolution of the collection of compounds in the solution containing the one or more macromolecule crystals.

Enzyme catalysis in organic solvents has attracted much interest in recent years (Mattos C. and Ringe D., Curr Opin Struct Biol., 11(6), 761–4) and the use of enzymes in non-aqueous media has extended the field of biocatalysis (ASGSB Bull., 4(2), 125–132 (1991)). Much work has been done to map out organic binding sites in crystals by soaking the crystals in organic solvents (English, A. C., et al., Proteins, 37, 628–640 (1999); Mattos C. and Ringe D., Nature Biotechnol., 14(5), 595–9 (1996)). In addition it has been shown that enzyme crystals can retain activity in organic solvents, both in the presence and absence of crosslinking agents (Ayala, M., et al., Biochem. Biophys. Res. Comm., 295(4), 828–31 (2002)).

An alternative approach is to separate the non-compatible solvent from the solution containing the one or more macromolecule crystals by a permeable membrane, which allows transfer of the compounds in the collection from the non-compatible solvent to the solution containing the one or more macromolecule crystals. This approach requires a membrane which is porous enough to allow the diffusion of the synthesised compounds from the solvent in which they were synthesised to the solution containing the one or more macromolecules, whilst substantially preventing any diffusion of the solvents. Dialysis buttons provide one means by which this can occur, and are available, for example, from Hampton Research. Their use is described in ‘Crystallisation of Nucleic Acids and Proteins’, edited by Ducruix, A. and Giege, R., The Practical Approach Series, Oxford University Press, 1992.

The ligands identified by the methods of the present invention may be subsequently modified to alter their binding to the target macromolecule or to improve their usefulness as a pharmaceutical. Such modification is conventional in the art. Possible modifications include: substitution or removal of groups containing residues which interact with the target macromolecule, for example groups which interact with the amino acid side chain groups of a protein; the addition or removal of groups in order to decrease or increase the charge of a group in a compound; the replacement of a charged group with a group of the opposite charge; or the replacement of a hydrophobic group with a hydrophilic group or vice versa. Additionally, a group may be replaced with another that retains similar properties but better occupies the cavity in the macromolecule, increasing the surface of the ligand in contact with the macromolecule cavity. This may be achieved using the methodologies disclosed in this invention, or by conventional synthetic approaches typically utilised by those skilled in the art of medicinal chemistry. Many of these changes will improve the usefulness of a compound as a pharmaceutical. It will be understood that these are only examples of the type of substitutions considered by medicinal chemists in the development of new pharmaceutical compounds and other modifications may be made, depending upon the nature of the starting compound and its activity.

Without wishing to be bound by theory, the detection of the ligand bound to the target macromolecule relies on the occupancy in the macromolecule crystals of one of the highest affinity ligands, this being driven by ligand-macromolecule interactions.

This method avoids disadvantages associated with biological screening methods in which the alteration of macromolecule activity by a potential ligand is assessed, as in that case compounds which bind weakly but non-specifically can alter macromolecule activity in a non-selective manner. Such non-selective inhibition produces false-positives, as the assay shows protein activity inhibition, but the compound would perform no useful function as a drug, as it would interfere with the activity of other proteins. Only compounds bound in a binding site will be detected by the present method. In particular, only compounds bound in a binding site with resolvable occupancy will be detected by the present method.

Binding sites are sites within a macromolecule, or on its surface, at which ligands can bind. Examples are the catalytic or active site of an enzyme (the site on an enzyme at which the amino acid residues involved in catalysing the enzymatic reaction are located), allosteric binding sites (ligand binding sites distinct from the catalytic site, but which can modulate enzymatic activity upon ligand binding), cofactor binding sites (sites involved in binding/co-ordinating cofactors e.g. metal ions), or substrate binding sites (the ligand binding sites on a protein at which the substrates for the enzymatic reaction bind). There are also sites of protein-protein interaction. If the macromolecule is a nucleic acid, then binding sites may be the bases of the nucleic acid, or spaces in their structures, e.g. the major or minor grooves in the helical DNA, sites of interaction with phosphate, ribose or deoxy ribose groups, or sites of intercalation between the bases.

The present method also enables screening where the target macromolecule has more than one active site, as the data for each site can be analysed independently from the other sites to determine the compound bound in that site. In such cases, the information the method of the invention provides on the binding of two or more separate ligands to the target macromolecule can be used in the linked-fragment approach to drug design, in a similar manner to the method described by Greer, et al., J. Med. Chem., 37(8), 1035–1054 (1994) for the synthesis of a thymidylate synthase inhibitor series. The basic concept behind linked-fragment approaches to drug design is to determine (computationally or experimentally) the binding location of plural ligands to a target molecule, and then to construct a molecular scaffold to connect the ligands together in such a way that their relative binding positions are preserved. The methods of synthesis and screening of the present invention may then be used to determine the best ligands from a library of such compounds, or the binding ability of individual compounds can be assessed using known methods.

Even if the present invention only provides information on the binding of ligands at a single binding site of a target macromolecule, a structure-based approach can be used to develop ligands which interact with further binding sites. Such a fragment growing approach is described in Blundell, T., et al., Nature Reviews Drug Discovery, vol. 1(1), 45–54 (2002).

It is preferred that the members of the collection of compounds are present at a concentration of at least 5 to 50 times, typically at least 10 times, their K_i (depending case by case on the macromolecule used) so that the occupation of the binding site in the target macromolecule will not depend on the relative quantities of each compound in the collection.

In the case of competitive binding of a ligand to a macromolecule, K_i is defined as:

$K_{i} = \frac{[M][L]}{[ML]}$

where [M] is the concentration of the macromolecule, [L] is the concentration of free ligand, and [ML] is the concentration of the ligand-macromolecule complex.
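The effect of working at a multiple of K_i can be illustrated with a simple single-site equilibrium model derived from the definition above. The following is a minimal sketch, not part of the disclosed method: it assumes each compound is in large excess over the macromolecule (so free ligand concentration approximates total concentration) and that all compounds compete for one site. The function names and the example concentrations are hypothetical.

```python
def single_site_occupancy(ligand_conc, k_i):
    """Fraction of sites occupied by one ligand.

    From K_i = [M][L]/[ML]: [ML] / ([M] + [ML]) = [L] / ([L] + K_i).
    Both arguments must use the same concentration units.
    """
    return ligand_conc / (ligand_conc + k_i)


def competitive_occupancies(ligands):
    """Occupancy of a single site by each of several competing ligands.

    `ligands` maps a compound name to (free concentration, K_i).
    occupancy_i = ([L_i]/K_i) / (1 + sum_j [L_j]/K_j)
    """
    ratios = {name: conc / k_i for name, (conc, k_i) in ligands.items()}
    denominator = 1.0 + sum(ratios.values())
    return {name: ratio / denominator for name, ratio in ratios.items()}


if __name__ == "__main__":
    # Occupancy at 5, 10 and 50 times K_i (K_i set to 1 in arbitrary units).
    for multiple in (5, 10, 50):
        print(multiple, round(single_site_occupancy(multiple, 1.0), 3))

    # Hypothetical mixture: equal concentrations, 100-fold difference in K_i.
    mixture = {"tight": (100.0, 1.0), "weak": (100.0, 100.0)}
    print(competitive_occupancies(mixture))
```

Under this model a compound at 10 times its K_i occupies about 91% of the sites when it is the only binder, and in a mixture the compound with the largest ratio of concentration to K_i dominates the site.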

Where inhibitors are binding in an uncompetitive or non-competitive fashion the K_i is defined as in Fundamentals of Enzyme Kinetics by A. Cornish-Bowden, Portland Press, 1995, ISBN 1 85579 0720, which is herein incorporated by reference.

It is preferred that each compound that is a member of the collection of compounds is present in the solution at a concentration which is at least 5 or 10 times the concentration of the target macromolecule in the reaction system, more preferably 100, 1 000 or even 10 000 times the concentration of the target macromolecule in the reaction system.

The binding of the ligands to the target macromolecule may be through non-covalent interactions or covalent bonding. If the target macromolecule is a protein, then covalent binding of the ligand to the protein may occur when the active site of the protein contains a catalytic residue such as in serine and cysteine proteases. If the target macromolecule is a nucleic acid, then certain classes of compounds are known to interact by covalent binding, e.g. pyrrolobenzodiazepines covalently bind to the exocyclic amino group of guanine.

In some embodiments of the present invention, it is preferred that the members of the collection of compounds do not bind covalently to the target macromolecule, but that they interact through non-covalent binding.

Further aspects of the invention relate to any novel compounds disclosed herein, their use as pharmaceuticals and their use in methods of therapy. In particular, further aspects of the invention include:

-   a) a ligand identified by the method of the present invention, or salts, solvates and chemically protected forms thereof;
-   b) a pharmaceutical composition comprising a ligand identified by the method of the present invention, or salts, solvates and chemically protected forms thereof, and a pharmaceutically acceptable carrier or diluent;
-   c) the use of a ligand identified by the method of the present invention, or salts, solvates and protected forms thereof, in a method of treatment of the human or animal body;
-   d) the use of a ligand identified by the method of the present invention, or salts, solvates and protected forms thereof, in the manufacture of a medicament for the treatment of a disease ameliorated by the binding of a ligand to the target macromolecule used in the method of the invention; and
-   e) a method for the treatment of a disease ameliorated by the binding of a ligand to the target macromolecule used in the method of the invention comprising administering to a subject suffering from said disease a therapeutically-effective amount of a ligand identified by the method of the present invention, or salts and solvates.

Suitable carriers and diluents and information on pharmaceutical compositions can be found in standard pharmaceutical texts, for example, Handbook of Pharmaceutical Additives, 2nd Edition (eds. M. Ash and I. Ash), 2001 (Synapse Information Resources, Inc., Endicott, New York, USA); Remington's Pharmaceutical Sciences, 20th Edition, pub. Lippincott, Williams & Wilkins, 2000; and Handbook of Pharmaceutical Excipients, 2nd edition, 1994.

Further details of the invention will now be presented by way of explanation and example.

Although the discussion below focuses on the purification, crystal growth, X-ray crystallography and determination of ligand structure when the macromolecule is a protein, the techniques described are, in general, applicable to other macromolecules, such as nucleic acids and complexes, with appropriate modifications as known to the person skilled in the art.

Target Protein Purification.

A specific target protein can be isolated from animal, plant, or bacterial sources directly, or via recombinant methods. The generation of recombinant protein, using systems such as insect cells (e.g. S. frugiperda, or Drosophila cells), E. coli, yeast (S. cerevisiae, S. pombe, P. pastoris, etc) or modified human cell lines, means that truncated, or otherwise genetically engineered, proteins can be generated. A protein crystallography project to obtain crystals normally necessitates access to a recombinant protein production system, but the method of the present invention may be performed with a single crystal, which may constitute, for example, between 0.1 and 100 μg.

It is generally accepted that the higher the degree of purity and homogeneity of a protein preparation, the easier it will be to grow protein crystals from the preparation. Protein purity reflects the number of protein species within a preparation. It also refers to the number, and nature, of any other non-protein species present (e.g. low molecular weight contaminants). An ideal protein preparation should contain solely one protein species, or one species of protein complex, in which all the protein molecules, or protein complexes, are identical in terms of their amino acid composition, mass etc. The purity of a protein preparation may be gauged via a variety of experimental techniques such as sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE), mass spectrometry, antibody binding and detection (Western blotting), etc. Protein purities in excess of 90% are often deemed acceptable for crystallisation trials, but practitioners of the art of protein purification will generally strive for purities in excess of this arbitrary threshold, due to the perceived benefits of maximising protein purities.

Within a protein preparation, homogeneity can refer to the degree of uniformity observed for parameters such as the stoichiometry of proteins in a multiprotein complex, the mono-dispersity of the protein/complexes in solution, the oxidation, or protonation, state of amino-acid side chains within proteins, the uniformity of post translational modifications (e.g. are all protein molecules within the population equivalently phosphorylated, glycosylated, or have any essential co-factors been uniformly and correctly incorporated) and the protein conformations that exist within a given population of protein molecules/complexes. The homogeneity of a protein preparation may be probed using a multitude of experimental methods some of which are: mass-spectrometry, Western blotting, SDS-PAGE, analytical ultracentrifugation, size-exclusion chromatography, affinity chromatography, ion-exchange chromatography, hydrophobic interaction chromatography, surface plasmon resonance, activity assay, electron microscopy, dynamic light scattering (DLS), N-terminal sequencing, iso-electric focussing (IEF), proteolytic digest, fluorescence, circular dichroism (CD), native gel electrophoresis, bandshift assays, or nuclear magnetic resonance (NMR). Maximising the degree of homogeneity within a protein preparation is again deemed desirable, as maximising homogeneity is also believed to positively correlate with maximising crystallisability.

The Growth of Protein Crystals

Crystallisation of any species requires the formation of a supersaturated solution of the species in question and a nucleation event that is capable of initiating crystal growth. Post-nucleation the ambient conditions must be such that crystal growth can be sustained until the physical dimensions and properties of the crystals thus obtained are adequate for any subsequent experimental procedures required. Protein molecules typically only retain their structural integrity within an aqueous environment. Therefore protein crystals are normally grown in the aqueous phase. Protein crystals may grow if a nucleation event occurs in a pure and homogeneous protein solution that has been driven to a state of super-saturation.

Protein crystallisation is generally attempted using the vapour diffusion (sitting drop, hanging drop, sandwich drop, pH gradient etc), dialysis, batch, micro-batch, liquid-liquid diffusion, or in-gel-crystallisation methods. All these methodologies have been extensively described (Protein Crystallisation: Techniques, Strategies and Tips, Edited by T. M. Bergfors, IUL Biotechnology Series, 1999, Published by International University Line, La Jolla, Calif., ISBN: 0-9636817-5-3). All of the aforementioned crystallisation processes function by generating a supersaturated protein solution, which promotes the spontaneous formation of crystallisation nuclei, and which is then subsequently able to sustain crystal growth.

There are diverse physico-chemical parameters that can influence whether or not a protein construct, or protein complex, will crystallise. Typically each protein crystallizes under a unique set of conditions, which cannot be predicted in advance. Simply driving the protein concentration to super-saturation, to bring it out of solution, will generally not work. The result would, in most cases, be an amorphous precipitate. Some parameters that may be varied are: the pH of solutions, the choice and concentration of buffer (if any) (e.g. Phosphate, MES, BIS-TRIS, TRIS, BES, PIPES, HEPES, MOPS, BICINE, CHES, CAPS etc), temperature, choice of crystallisation method (see above), volume of crystallisation, protein concentration, the addition of reducing agents (e.g. DTT, β-mercaptoethanol), detergents (e.g. decyl-β-D-maltoside, dodecyl-β-D-maltoside, octyl-β-D-glucopyranoside, decanoyl-N-methylglucamide, Triton, octyltetraoxyethylene ether, etc.), alcohols (e.g. ethanol, isopropanol, methanol, 2-methyl-2,4-pentanediol (MPD)), salts (e.g. chlorides, acetates, sulphates, phosphates, bromides, iodides, fluorides, nitrates, bicarbonates, chlorates, chromates, citrates, tartrates, cacodylates, formates, hydroxides, etc.), polyethylene glycols (PEGs), ethylene glycols, methoxypolyethylene glycols (MPEGs), heavy atoms and ions (e.g. iron, copper, zinc, cobalt, manganese, nickel, tungstates, vanadates, sodium, magnesium, potassium, lithium, calcium, aluminium, Xenon, etc.), or other additives such as dimethylsulfoxide (DMSO), denaturants (e.g. urea, guanidinium chloride, etc.), glycerol, sulfobetaines, jeffamines, AMPPNP, ATP, ADP, GTP, GDP, peptides, tertiary-butanol, amino acids, azides, DNAs, RNAs, sugars, lipids, drugs, etc.

There are numerous crystallisation kits available (e.g. from Hampton Research), which attempt to broadly sample as many parameters in crystallization space as possible. In many cases these general screens help to identify a starting point for crystallisations in the form of crystalline precipitates and/or rough, or micro-, crystals. Typically these crystals are unsuitable for direct diffraction analysis and require further optimisation. Successful crystallization can be aided by knowledge of a protein's behaviour in terms of solubility, dependence on metal ions for correct folding or activity, interactions with other molecules and any other data that are available.

Systematically screening such a large number of parameters represents an extremely complex multi-dimensional search problem and is, as such, exceptionally difficult to perform in a systematic manner. Even with the advent of automated protein crystallisation it is often the case that crystallisation of a protein will require a very high degree of human input and will be subject to the impact of intangible parameters such as serendipity, insight, and random error.

If preliminary crystals have been obtained it is often necessary to further modify the crystallisation conditions in an attempt to simultaneously maximise the internal order and physical dimensions of the crystals grown. Optimising these parameters is deemed beneficial for helping to maximise the data quality obtained in subsequent X-ray diffraction experiments. Identification of a set of initial crystallisation conditions reduces the potential parameter space that has to be explored, but crystal optimisation can still remain a time consuming and laborious process. Techniques such as macro- or micro-seeding may also aid crystal optimisation.

Details of some of the proteins crystallised, and information on some of the protein crystallisation conditions identified, are contained, for example, within the following internet databases:

-   http://wwwbmcd.nist.gov:8080/bmcd/bmcd.html (Gilliland, G. L., et al., Acta Crystallogr., D50, 408–413 (1994));
-   http://xray.bmc.uu.se/embo/structdb/links.html;
-   http://www.mpibp-frankfurt.mpg.de/michel/public/memprotstruct.html;
-   http://www.rcsb.org/pdb/ (Berman, H. M., et al., Nucleic Acids Research, 28, 235–242 (2000));
-   http://www.ebi.ac.uk/msd/

and have been described in the following publications:

Blundell, T., et al., “Protein Crystallography”, Academic Press, New York (1976); McPherson, et al., “Preparation and Analysis of Protein Crystals” in “Preparation and Analysis of Protein Crystals”, John Wiley & Sons, New York (1982); Carter, et al., “Design of crystallization experiments and protocols.”, pages 47–71 in “Crystallization of Nucleic Acids and Proteins: A Practical Approach”, (Ducruix, A. & Giege, R., eds) IRL Press, Oxford (1992); Ducruix, A., et al., “Methods of crystallization.”, pages 73–98 in “Crystallization of Nucleic Acids and Proteins: A Practical Approach”, (Ducruix, A. & Giege, R., eds) IRL Press, Oxford (1992); “Protein Crystallisation: techniques, strategies, and tips.” IUL Biotechnology Series (1999), ISBN 0-9636817-5-3.

Obtaining X-ray Diffraction Data from Soaked Protein Crystals

An X-ray diffraction experiment consists of exposing a protein crystal to a collimated, coherent, beam of X-rays and recording the resulting X-ray diffraction pattern. A diffraction pattern arises from the elastic scatter of X-rays off electrons within the planes of atoms within a protein crystal. The mathematics underlying X-ray diffraction may be represented in their simplest form by the Bragg equation:

nλ = 2d sin θ

where n is an integer, λ is the wavelength of the incident X-rays, θ is the scattering angle of the X-rays off a given plane of atoms, and d is the Bragg spacing, or spacing between successive planes of atoms (Bragg planes). Thus as long as λ corresponds to atomic scales (i.e. ≈1 Å) then atomic scale features should be discernible within an electron density map calculated using the diffraction data. Typically it is considered that X-ray data are required to a Bragg spacing of ≤3 Å, for a given X-ray wavelength, if the data from an X-ray diffraction experiment are to be of use in determining the atomic positions within a structure. During an experiment the various Bragg planes within a crystal will only satisfy the mathematical criteria necessary for diffraction at specific crystal orientations relative to the incident X-ray beam. So as to obtain diffraction data relating to as many Bragg planes as possible the crystal is rotated in the X-ray beam. Thus all possible plane orientations are explored. The angle through which a crystal must be rotated in order to obtain a complete set of data relating to a specific Bragg spacing is defined by the space group of the crystal and also the initial orientation of the crystal in the X-ray beam. The smallest Bragg spacing for which diffraction data are available is termed the resolution of the experiment. It is desirable to maximise the data completeness for an experiment. That is, data should ideally be 100% complete up to the resolution limit of the experiment.
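The Bragg relation can be rearranged to give either the spacing probed at a given angle or the angle needed to reach a given resolution. The following is a small illustrative sketch; the function names are hypothetical, and the wavelength used in the example (1.5418 Å, the commonly quoted Cu Kα value) is an assumption chosen for illustration rather than a value taken from this description.

```python
import math


def bragg_spacing(wavelength, theta_deg, order=1):
    """Spacing d from n*lambda = 2*d*sin(theta); d has the units of wavelength."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def bragg_angle(wavelength, spacing, order=1):
    """Angle theta (degrees) at which planes of the given spacing diffract."""
    sine = order * wavelength / (2.0 * spacing)
    if sine > 1.0:
        # No angle satisfies the Bragg equation: spacing too small for this wavelength.
        raise ValueError("n * wavelength exceeds 2 * d; no diffraction possible")
    return math.degrees(math.asin(sine))


if __name__ == "__main__":
    wavelength = 1.5418  # Angstrom, assumed Cu K-alpha source
    for d in (3.0, 2.0, 1.5):
        print(f"d = {d} A -> theta = {bragg_angle(wavelength, d):.2f} degrees")
```

With this assumed wavelength, reaching the 3 Å spacing mentioned above requires recording reflections out to a θ of roughly 15 degrees, and smaller spacings require progressively larger angles.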

Unfortunately the high energy of X-ray photons means that they cause damage to protein molecules. This is thought to be at least partially due to the generation of ions and free radicals within the crystals. Prolonged exposure of a protein containing crystal to an X-ray beam will thus result in a deterioration and decay of the proteins. This is typically manifested by a decline in the data resolution and quality. This problem may be partially circumvented by cryogenically freezing protein containing crystals in vitreous ice at 100 K. Freezing of the crystal means that any ion, or free radical, species that are generated are unable to migrate through the crystal. Thus the longevity of the crystal in the X-ray beam is extended and the data quality and resolution typically improved relative to an unfrozen data collection. Normally protein containing crystals cannot be directly frozen in the solutions in which they grew (mother liquor). This is because direct freezing often leads to the formation of ice crystals within the mother liquor. These ice crystals can destroy the internal order within a protein crystal and thus abolish diffraction. The addition of a cryo-protectant to the protein's mother liquor can, however, lead to the formation of vitreous ice on freezing. This should not destroy the internal order of a protein crystal and thus retain diffraction from the crystal. Normally protein containing crystals must be transferred from their mother liquor into a specially formulated cryo-protectant solution prior to freezing. The exact composition of the cryo-protectant solution, the transfer protocol, and the freezing protocol must be uniquely determined for each crystal system and often for each experiment.

Determination of Ligand Structure

A mathematical operation termed a Fourier transform relates the diffraction pattern observed from a crystal and the molecular structure of the protein and ligand comprising the crystal (Blundell, T., et al., “Protein Crystallography”, Academic Press, New York (1976); Drenth, “Principles of Protein X-ray Crystallography”, Springer (1994)). A Fourier transform may be considered to be a summation of sine and cosine waves each with a defined amplitude and phase. Thus, in theory, it is possible to calculate the electron density associated with a protein and ligand structure by carrying out an inverse Fourier transform on the diffraction data. This requires amplitude and phase information to be extracted from the diffraction data. Amplitude information may be obtained by analysing the intensities of the spots within a diffraction pattern. The conventional methods for recording diffraction data do, however, mean that any “phase information” is lost. This phase information must be in some way recovered and the loss of this information represents the “crystallographic phase problem”. The phase information necessary for carrying out the inverse Fourier transform can be obtained via a variety of methods.
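For illustration, the inverse Fourier transform referred to above is commonly written in the following form. The symbols are assumptions introduced here and follow standard crystallographic usage: V is the unit cell volume, (h, k, l) are the indices of a reflection, (x, y, z) are fractional coordinates within the unit cell, |F(hkl)| is the amplitude derived from the measured spot intensity, and φ(hkl) is the phase that is lost on recording.

```latex
\rho(x, y, z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
    \left| F(hkl) \right| \, e^{\,i\varphi(hkl)} \, e^{-2\pi i\,(hx + ky + lz)},
\qquad \left| F(hkl) \right| \propto \sqrt{I(hkl)}
```

The amplitudes are available from the measured intensities I(hkl), whereas the phases φ(hkl) must be supplied from another source before the summation can be evaluated.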

If the structure of the unsoaked protein is already available, as would normally be the case, a set of theoretical amplitudes and phases may be calculated using the protein model and then the theoretical phases combined with the experimentally derived amplitudes. An electron density map may then be calculated and the protein and ligand structure observed. Electron density maps can be calculated using programs such as those contained in the CCP4 computing suite (Collaborative Computational Project 4, Acta Crystallographica, D50, 760–763 (1994)). For map visualization and model building, programs such as “O” (Jones, et al., Acta Crystallographica, A47, 110–119 (1991)) or “QUANTA” (Jones, et al., (1991) and commercially available from Accelrys, San Diego, Calif.) can be used.

An alternative approach employs (i) X-ray crystallographic diffraction data from the complex of ligand and protein and (ii) a three-dimensional structure of the unsoaked protein, to generate a difference Fourier electron density map of the complex. The difference Fourier electron density map may then be analysed to identify the ligand.
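One commonly used form of difference Fourier synthesis is sketched below for illustration. The choice of coefficients is an assumption, since no particular coefficients are prescribed here: |F_o| denotes the amplitudes observed from the soaked crystal, while |F_c| and φ_c denote the amplitudes and phases calculated from the model of the unsoaked protein. V, (h, k, l) and (x, y, z) are the unit cell volume, reflection indices and fractional coordinates respectively.

```latex
\Delta\rho(x, y, z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
    \bigl( \left| F_{o}(hkl) \right| - \left| F_{c}(hkl) \right| \bigr)\,
    e^{\,i\varphi_{c}(hkl)} \, e^{-2\pi i\,(hx + ky + lz)}
```

In a map of this kind, positive density that is not accounted for by the protein model is a candidate location for a bound ligand.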

Analysis of electron density maps may be aided by software, for example, AutoSolve® (Blundell, T., et al., Nature Reviews Drug Discovery, 11, 45–54 (2002)) or the ligand fitting module in QUANTA, XLIGAND (QUANTA: see above; X-LIGAND: Oldfield, T. J., Acta Crystallogr D Biol Crystallogr., 57(5), 696–705 (2001)).

If there is no known structure of the protein then alternative methods for obtaining phases must be explored so as to resolve the structure of the unsoaked protein (Blundell, T., et al., “Protein Crystallography”, Academic Press, New York (1976)). One method is multiple isomorphous replacement (MIR). This relies on soaking “heavy atom” (e.g. platinum, uranium, mercury, etc.) compounds into the crystals and observing how their incorporation into the crystals modifies the spot intensities observed in the diffraction pattern. An alternative method for obtaining phase information for a protein of unknown structure is to perform a multi-wavelength anomalous dispersion (MAD) experiment. This relies on the absorption of X-rays by electrons at certain characteristic X-ray wavelengths. Anomalous scattering by atoms within a protein will modify the diffraction pattern obtained from the protein crystal. Thus if a protein contains atoms which are capable of anomalous scattering, a diffraction dataset (anomalous dataset) may be collected at an X-ray wavelength at which this anomalous scattering is maximal. The most usual way to introduce anomalous scatterers into a protein is to replace the sulphur-containing methionine amino acid residues with selenium-containing seleno-methionine residues. This is done by generating recombinant protein that is isolated from cells grown in controlled growth media that contain seleno-methionine (Doublie, S., Methods in Enzymology, 276, 523–530 (1997)). Selenium is capable of anomalously scattering X-rays and may thus be used for a MAD experiment. Another method generally available for the calculation of the phases necessary for the determination of an unknown protein structure is molecular replacement. This method relies upon the assumption that proteins with similar amino acid sequences (primary sequences) will have a similar fold and three-dimensional structure (tertiary structure). Examples of computer programs known in the art for performing molecular replacement are CNX (Brunger, A. T., et al., Current Opinion in Structural Biology, 8(5), 606–611 (1998) and also commercially available from Accelrys, San Diego, Calif.) or AMORE (Navaza, J., Acta Cryst., A50, 157–163 (1994)). The phase information obtained by one of these means, when combined with the experimentally obtained amplitudes from the native dataset, enables an electron density map of the unknown protein molecule to be calculated using the Fourier transform method.

If an electron density map has been calculated for a protein of unknown structure then the amino acids comprising the protein must be fitted into the electron density for the protein. This is normally done manually, although high resolution data may enable automatic model building. The process of model building and fitting the amino acids to the electron density can be both a time consuming and laborious process. Once the amino acids have been fitted to the electron density it is necessary to refine the structure. Refinement attempts to maximise the correlation between the experimentally calculated electron density and the electron density calculated from the protein model built (Blundell, T., et al., “Protein Crystallography”, Academic Press, New York (1976) and “Methods in Enzymology”, vols. 114 & 115, Wyckoff, H. W., et al., eds., Academic Press (1985)). Refinement also attempts to optimise the geometry and disposition of the atoms and amino acids within the user-constructed model of the protein structure. Sometimes manual re-building of the structure will be required to release the structure from local energetic minima. There are now several software packages available that enable an experimentalist to carry out refinement of a protein structure such as CNX (see above), or REFMAC (Murshudov, G. N., et al., Acta Crystallographica, D53, 240–255 (1997)). There are certain geometry and correlation diagnostics that are used to monitor the progress of a refinement. These diagnostic parameters are monitored and rebuilding/refinement continued until the experimenter is satisfied that the structure has been adequately refined.

The atomic coordinate data of the co-complexes formed from the methods of the invention can be routinely accessed using computer programs, for example, RASMOL (Sayle, et al., TIBS, 20, 374 (1995)), a publicly available computer software package that allows access to and analysis of atomic coordinate data for structure determination and/or rational drug design, or AstexViewer™, which is contained in the CCP4 computing suite.

Information from X-ray Crystallography

The information obtained by the method according to the invention can be used to provide much more information than the identity of the ligand which has bound. As the information on the ligand results from fitting to the electron density measured in the protein active site, this can allow the mode of binding and interactions to be ascertained, which can be useful in further elaboration and optimisation of the ligand.

The Collections of Compounds

The collections of compounds for use in the present invention are produced by one or more synthetic processes designed to connect two or more sets of monomers together. Monomers are molecules which share a common reactivity that makes them capable of combining with another complementary monomer to form a larger compound. Each set of monomers will contain at least one monomer and preferably no more than 100 monomers, such that between about 5 and about 1000 compounds are in each collection of compounds. It is preferred that all the compounds in a collection comprise a common functional group (sometimes referred to as a ‘templating moiety’) which is produced by the reaction of two or more complementary functional groups present on the monomer sets used in the synthetic process. In one preferred embodiment each compound in a collection is related to other members of the collection by virtue of being synthesised from at least one common monomer unit. Having common features or trends in the structures of the compounds in the collection makes it possible to identify which moieties in the compounds (derived from individual monomer units) bind best to the target protein, even if the independent monomer itself does not have any detectable binding. This is achieved by observing the preference for binding that the macromolecule exhibits for compounds from the collection.

It is further preferred that the collection of compounds will have shape features that differ sufficiently to allow the members of at least one set of monomers used in their production to be distinguished from each other. This then allows the chemical structure of the bound ligand, or at least part of its structure, to be determined. If only part of its structure can be determined, re-synthesis of some of the members of the collection of compounds that contain this partial structure will be necessary, and these compounds are soaked into the crystal as singlet experiments so that the chemical structure of the bound ligand or ligands can be determined. This type of re-synthesis approach is termed chemical deconvolution and is well known to practitioners of combinatorial or parallel synthesis as a means of identifying the biologically active member of a mixture of synthesised compounds.

‘In-situ’ Synthesis

As mentioned above, protein crystals are sensitive to the media in which they are grown and kept, and therefore the methods for ‘in-situ’ synthesis, i.e. synthesis of the collection of compounds in the presence of the target protein crystals, have to be chosen such that they can occur under conditions which will not destroy the target protein crystal. If some degradation of the crystals occurs, this should not be to such an extent that the X-ray diffraction results are not of sufficient quality to allow for identification of the ligand (see above).

Typical reactions are those which can be carried out in purely aqueous media at around room temperature (4 to 30° C.), and at a pH of 4 to 10.

Examples of these chemistries, which apply to both methods of combinatorial synthesis and parallel synthesis, include: Acetal or Ketal formation; Addition reactions; Aldol condensations and related condensation reactions; Allylations; Cycloaddition reactions; Disulfide formation; Hydrazone formation; Mannich reactions; Michael reactions and related Conjugate Addition reactions; Palladium mediated reactions; Reductive alkylation; Substitution reactions; and Three or Four Component Reactions.

The following reactions can be carried out using synthetic procedures described in general organic chemistry texts such as March's Advanced Organic Chemistry (Smith, M. B. and March, J., 5th edition, Wiley-Interscience, New York, 2001) or references given therein. Some synthetic reactions carried out in aqueous conditions have recently been reviewed by Ulf Lindstroem, see “Stereoselective Organic Reactions in Water”, Chemical Reviews, 102(8), 2751–2771 (2002). The practice of combinatorial chemistry is described in references cited in publications by Roland Dolle (Journal of Combinatorial Chemistry, 4(5), 369–418 (2002); Journal of Combinatorial Chemistry, 3(6), 477–517 (2001); Journal of Combinatorial Chemistry, 2(5), 383–433 (2000); Molecular Diversity, 4(4), 233–256 (2000); Journal of Combinatorial Chemistry, 1(4), 235–282 (1999); Molecular Diversity, 3(4), 199–233 (1998)).

These reactions take place between appropriate sets of ligand precursor molecules (‘monomers’), which are represented schematically below, with M representing a substituent group that is varied in each monomer set (e.g. M₁, M₂, M₃, etc.). Each reacting set of monomers can have as few as a single member, up to, for example, 40 members, although a maximum of about 20 members would be more usual. In some embodiments of the invention it is preferred that each set of reacting monomers comprises at least two monomers. A typical size for the resulting collection of compounds would be between 5 and 1000, preferably between 5 and 100, with a range of 10 or 20 to 50 or 70 being preferred.

In the following examples, R represents a substituent or H on a monomer.

Acetal or Ketal Formation

Addition of alcohols or a diol to an aldehyde or ketone, catalysed by acid. The reactions are fully reversible, leading to thermodynamic products.

Addition Reactions

For example, addition to an epoxide under acidic conditions, or addition of alcohols to enol ethers.

X = O, S, N or P, and where the ring may be optionally substituted on any available position.

Aldol Condensations and Related Condensation Reactions

For examples, see Kobayashi, S. and Manabe, K., Accounts of Chemical Research, 35(4), 209–217 (2002).

Related condensation reactions are the Knoevenagel reaction, the Peterson reaction, the Perkin reaction, the Darzens reaction, Tollens' reaction, the Wittig reaction and the Thorpe reaction. Careful selection of the monomers is required in order for these reactions to proceed under aqueous conditions.

Allylations

For examples, see Kobayashi, S. and Hachiya, I., Yuki Gosei Kagaku Kyokaishi, 53(5), 370–80 (1995).

Cycloaddition Reactions

For example, the Diels-Alder reaction, see Fringuelli, F., et al., European Journal of Organic Chemistry, 3, 439–455 (2001).

Many variations exist of the above formula, including where heteroatoms are incorporated (e.g. aza-Diels-Alder reactions). Cycloadditions to form 5-membered ring systems are also very general and an illustrative example is the cycloaddition of nitrile oxides with alkynes to form isoxazoles, which occurs at room temperature under very mild conditions.

An example of a cycloaddition reaction that is water-tolerant is the “click chemistry” described by Sharpless in Lewis, et al., Angew. Chem. Int. Ed., 41(6), 1053–1057 (2002). In this an azide and an acetylene undergo a Huisgen 1,3-dipolar cycloaddition to give 1,2,3-triazoles.

Disulfide Formation

Occurs reversibly under very mild conditions.

Hydrazone Formation

Occurs reversibly under very mild conditions.

Mannich Reactions

For examples, see Akiyama, T., et al., Advanced Synthesis & Catalysis, 344(3+4), 338–347 (2002).

Michael Reactions and Related Conjugate Addition Reactions

Addition of nucleophiles to α,β-unsaturated carbonyl compounds is another example of addition reactions suitable for use in the invention.

The carbonyl can also be replaced by other electron withdrawing groups, such as nitro groups, see Da Silva, F. and Jones, J., Journal of the Brazilian Chemical Society, 12(2), 135–137 (2001).

A related transformation is the Baylis-Hillman reaction, see Yu, C., et al., Journal of Organic Chemistry, 66(16), 5413–5418 (2001).

Palladium Mediated Reactions

Many palladium mediated reactions can be carried out in aqueous media, e.g. Heck, Sonogashira, Tsuji-Trost, Suzuki, Stille, see Pierre Genet, J. and Savignac, M., Journal of Organometallic Chemistry, 576(1–2), 305–317 (1999). A representative illustration is the Suzuki cross coupling reaction:

Reductive Alkylation

Occurs under very mild conditions. An example carried out in the presence of a protein is Hochguertel, M., et al., Proceedings of the National Academy of Sciences of the United States of America, 99(6), 3382–3387 (2002).

Substitution Reactions

Many useful substitution reactions occur under aqueous conditions, e.g. nucleophilic displacement (with alcohols, amines, thiols, carboxylic acids, enolates, hydrazines, dithianes etc.) of alkyl halides, tosylates, mesylates and azides; ester, amide and urea formation by displacement of an activated ester or carbonate or carbamate; and aromatic nucleophilic substitution of electron deficient aromatic compounds with amines, alcohols, thiols etc.

An example of alkylation chemistry in the presence of a protein is Nguyen, R. and Huc, I., Angew. Chem. Int. Ed., 40(9), 1774–6 (2001).

A generic scheme is as follows:

Where LG is a leaving group, and X is a nucleophilic heteroatom or carbon anion.

Three or Four Component Reactions

A number of multicomponent reactions proceed under mild mixed aqueous conditions and are suitable for combinatorial library design for the purposes of this invention. One example is the Ugi condensation (see Domling, A., Current Opinion in Chemical Biology, 6(3), 306–313 (2002); Ugi, I., et al., Combinatorial Chemistry, 125–165 (1999)):

Also encompassed in the scope of the invention is the design of combinatorial reactions in which more than one functional group can be present on any given monomer so that multimeric ligands can be assembled. In this way two or more monomers can be assembled by two or more functional group interconversions using chemistry illustrated above or other chemistry possible under mild aqueous conditions. For example, schematically:

in which the monomer sets containing groups M₁ and M₂ react together through one chemistry (A+B=X) to give trimeric products containing the groups M₁ and M₂, and monomer sets containing M₁, M₂ and M₃ react together through two chemistries (A+B=X and P+Q=Y) to give trimeric products containing the groups M₁ and M₂ and M₃.

Other factors which need to be considered in designing suitable combinatorial library syntheses under aqueous conditions are as follows:

Solubility: the solubility in water of the reacting monomers may limit the efficiency of the transformations.

Catalysis: Brønsted and Lewis acid catalysis and other catalysts may be used to allow a transformation to proceed in an aqueous environment. [For example Lindstroem, U., Chemical Reviews, 102(8), 2751–2771 (2002).]

Micelles: the use of micellar catalysts to enable the use of water as a solvent. [For examples see Lindstroem, U. (2002)]

Cosolvents: the use of cosolvents to enable the use of water as a solvent, including, but not limited to, DMSO, polyethylene glycols, ethylene glycols, methanol, ethanol, isopropanol, acetone and acetonitrile. These cosolvents should be compatible with the protein crystal and may typically be used in an amount of up to 20% of the solvent system, which is preferred, although an amount of up to 40% or higher may be possible.

Solubilisers: the use of solubilising agents to enable the use of water as a solvent in reactions of organic compounds. [For examples see Lindstroem, U. (2002)]

Surfactants: the use of surfactants to enable the use of water as a solvent in reactions of organic compounds. [For examples see Lindstroem, U. (2002)] These surfactants should be compatible with the protein crystal.

The above described combinatorial library synthesis and procedures can similarly be adapted for mixed aqueous solvent conditions.

In some embodiments of the present invention, it is preferred that the monomers do not substantially bind to the target protein, and that only the collection of compounds formed includes compounds that show affinity for the target protein. Preferably the ratio of the K_(i) of the strongest binding ligand formed to the K_(i) of the strongest binding monomer is at least 10 to 1, more preferably 100, 1000 or even 10000 to 1.

In a preferred case, it is possible that the presence of the protein crystal in the solution in which the combinatorial chemistry takes place will exert an influence on the reactions occurring. If the reactions are reversible, then without wishing to be bound by theory, this would allow generation of thermodynamic products having the advantage of allowing for the ‘local enrichment’ of products at the protein surface leading to formation of the most potent ligand possible in the library. The principles of this effect are described by Huc, I. and Nguyen, R., Combinatorial Chemistry and High Throughput Screening, 109–130 (2001). If the reactions are irreversible, then without wishing to be bound by theory, this would allow generation of kinetic products having the advantage of ‘local templating’ of products at the protein surface leading to formation of the most potent ligand possible in the library. The principles of this effect are described by Nguyen, R. and Huc, I., Angew. Chem. Int. Ed., 40(9), 1774–6 (2001) and in ‘click chemistry’ e.g. Lewis, W., et al., Angew. Chem. Int. Ed., 41(6), 1053–1057 (2002).

‘Just-in-time’ Synthesis

As ‘just-in-time’ synthesis is not carried out in the presence of the protein crystals, there is less restriction on the types of chemistries that can be used to generate the collections of compounds for screening. However, in selecting appropriate starting materials and reaction conditions, it is preferred that the reactions do not result in a large number of by-products that could interfere with the screening process, and so reactions that do not require extraneous reagents are preferred.

The reactions discussed above in relation to the ‘in-situ’ synthesis would be particularly suitable, but other reactions could be considered, for example those using methodology in which the reagents that catalyse or drive the synthetic conversions are bound onto a solid phase medium and therefore removed from the solution by filtration. These chemistries, suitable for use with this invention, are described by Ley, S. V., et al., Perkin 1, 23, 3815–4195 (2000) and references cited therein.

Such reactions can be carried out using synthetic procedures described in general organic chemistry texts such as March's Advanced Organic Chemistry (Smith, M. B. and March, J., 5th edition, Wiley-Interscience, New York, 2001) or references given therein, and reference is also made to the texts on the practice of combinatorial chemistry given above.

As in ‘in-situ’ synthesis, the reactions take place between appropriate sets of ligand precursor molecules (‘monomers’), in which at least one substituent group is varied, to result in sets of monomers. Each reacting set of monomers can have as few as a single member, up to, for example, 40 members, although a maximum of about 20 members would be more usual. In some embodiments of the invention it is preferred that each set of reacting monomers comprises at least two monomers. A typical size for the resulting collection of compounds would be between 5 and 1000, preferably between 5 and 100, with a range of 10 or 20 to 50 or 70 being preferred.

The usual purification and characterisation steps which are used in the practice of combinatorial or parallel synthesis are not required in methods according to the present invention. These steps are viewed as essential in the conventional practice of these synthesis methods in order to produce compounds suitable for testing in a biological assay, as described in many of the references cited herein. Purification does not involve physical separation techniques such as solvent evaporation or removal of insolubles, e.g. by sedimentation, centrifugation or filtration. Conventional purification methods include aqueous extraction, trituration, chromatography such as flash column chromatography or HPLC purification, crystallisation and distillation, although certain characterisation methods can also be used in purification. Characterisation may be carried out in a number of ways, including using LCMS, MS and NMR analysis.

In a particularly preferred embodiment of the invention, the chemistry used for the ‘just-in-time’ synthesis is carried out in a solvent suitable as a co-solvent for aqueous solutions of protein crystals. Such solvents include DMSO, NMP and alcohols such as methanol and ethanol (for more details, see discussion of cosolvents in ‘in-situ’ synthesis). In this way, after incubation of the reaction for a suitable period of time, aliquots can be taken of the solution and added directly to the protein crystal containing solution, without the need for any purification or characterisation steps.

EXAMPLE

The protein chosen as the target macromolecule was cyclin-dependent kinase 2 (CDK2). This target has been the subject of intense study with the aim of developing inhibitors for the treatment of a number of human cancers, and has been crystallized with a number of inhibitors bound in the ATP-binding groove (De Azevedo, W. F., et al., Eur. J. Biochem., 243, 518–526 (1997); Hoessel, R., et al., Nat. Cell Biol., 1, 60–67 (1999)).

The collection of compounds chosen for synthesis was based on the oxindole template, being a class of inhibitors already disclosed for CDK2 (Bramson, H. N., et al., J. Med. Chem., 44, 4339–4358 (2001)). These ligands (AB; Scheme 1) present substituents in adjacent lipophilic binding pockets within the ATP binding groove and can be disconnected to monomers of approximately equal size and complexity (hydrazines A and isatins B; Scheme 1).

A range of hydrazines (A1 to A6; Table 1) and isatins (B1 to B5; Table 1) were chosen, so that the collection of oxindoles formed would present a range of functional groups to the ATP binding site.

TABLE 1

Hydrazines    Isatins
A1            B1
A2            B2
A3            B3
A4            B4
A5            B5
A6

(Structures not reproduced.)

The synthesis of the collection of ligands was then demonstrated to proceed under aqueous conditions in the presence of 20% of the co-solvent dimethylsulfoxide (DMSO), according to scheme 2:

Analysis by mass spectrometry of the 30 reactions in the array indicated that all reactions successfully formed the expected products over a 24–72 hour period and the efficiency of the individual reactions was assessed by LC/MS analysis (Table 2). In most cases, under these reaction conditions, the products slowly precipitated from the reaction solution over time. It is clear from the qualitative data presented in Table 2 that monomer A6 did not give high conversions or high purity in the reactions and that some other individual reactions were also poor; however, the variability in absolute amount of products formed between individual reactions would be expected to be no more than 10-fold from these results.
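The size of such an array follows directly from the sizes of the monomer sets. The following minimal sketch enumerates the 6 × 5 array of hydrazine and isatin pairings. The function name and the product labels of the form "A1B1" are assumptions made for illustration and are not nomenclature used elsewhere herein.

```python
from itertools import product

# Monomer labels as given in Table 1
hydrazines = [f"A{i}" for i in range(1, 7)]  # A1 to A6
isatins = [f"B{i}" for i in range(1, 6)]     # B1 to B5


def enumerate_collection(*monomer_sets):
    """Return one label per product, taking one monomer from each set."""
    return ["".join(combo) for combo in product(*monomer_sets)]


# Every hydrazine paired with every isatin
collection = enumerate_collection(hydrazines, isatins)
array_size = len(collection)
```

The same function accepts three or more monomer sets, in which case the collection size is the product of the individual set sizes.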

TABLE 2 (% purity by peak area of product by LC/MS)

A × B            A1      A2        A3        A4        A5             A6
                         R¹ = Cl   R² = Cl   R³ = Cl   R³ = SO₂NH₂    R¹ = Cl; R³ = SO₂Me
B1 (R⁵ = NO₂)    10–25   60–95     60–95     30–50     60–95          30–50
B2 (R⁵ = Cl)     60–95   60–95     60–95     60–95     60–95          30–50
B3 (R⁵ = SO₃H)   10–25   60–95     30–50     10–25     60–95          30–50
B4 (R⁷ = CF₃)    30–50   60–95     60–95     60–95     60–95          30–50
B5 (R⁵ = OCF₃)   30–50   60–95     60–95     60–95     60–95          10–25

R groups = H unless indicated otherwise.

Studies of kinetic competition between the monomers indicated that the products formed non-stoichiometric mixtures, as would be expected given that the hydrazines and isatins have varying reactivities. LC/MS analysis of competition experiments (using a Photodiode Array Detector scanning from 200–400 nm wavelengths) indicated that mixtures of the hydrazines reacted with a 10-fold deficit of each isatin gave less than a 5-fold excess of any individual product over any other and typically less than a 3-fold excess. These data again indicated that monomer A6 tended to be disfavored, but again a measurable amount of product resulting from reaction of this monomer with each isatin was detected (5–10% of the mixture).

These results show that during ligand synthesis in the presence of the protein all of the reaction products were capable of being formed at a useful concentration. Additionally, under these conditions, thermodynamic products would be expected to predominate because the reactions are fully reversible.

Analysis of the reactions, e.g. by LC-MS, and isolation of the compounds are not required for the method of the invention, but having now established reaction conditions suitable for use in the X-ray screening method disclosed in this invention, it is now possible to change the substitution patterns present in the monomer sets A and B (i.e. R¹ to R⁷) and to carry out in situ synthesis of further libraries of compounds, not described herein, in the presence of CDK2 crystals.

Crystals of full-length (residues 1–298) human cdk2 were grown. For soaking purposes crystals were transferred into a solution that maintained the ionic strength and precipitant concentrations of the original crystal mother liquor, but also contained 20% DMSO and the hydrazine and isatin reactive species (see Table 3). In all cases the total hydrazine concentration was in 10-fold excess over the isatin concentration.

The present invention also includes the use of the aforementioned methods for the generation of CDK2 ligands.

Experimental Methods

Crystals of full length (amino acids 1–298) human cyclin dependent kinase 2 (cdk2) were grown under the conditions detailed in (1) Lawrie, et al., Nat. Str. Biol., 4, 796–800 (1997) and (2) Rosenblatt, J., et al., J. Mol. Biol., 230, 1317–1319 (1993). Crystals grown using (1) were obtained using the hanging drop, vapour diffusion method, at 4° C. or 18° C. 1 μl of 10 mg/ml cdk2 in 10 mM HEPES/NaOH pH 7.4, 15 mM NaCl, was mixed with 1 μl of reservoir solution. The reservoir solutions (1 ml total volume) contained (25–55) mM ammonium acetate, (10–17.5)% polyethylene glycol (PEG) (average molecular weight 3350) and 100 mM HEPES/NaOH pH 7.4. Crystals grown using (2) were also produced by the hanging drop, vapour diffusion method at 4° C. 4 μl of 10 mg/ml cdk2 in 10 mM HEPES/NaOH pH 7.4 were suspended over a 1 ml reservoir solution containing (200–800) mM HEPES/NaOH pH 7.4.

For soaking purposes crystals grown using (1) were transferred into microbridges containing 20 μl of soak solution. Soak solutions were prepared such that the ammonium acetate, PEG and HEPES/NaOH pH 7.4 concentrations were identical to the drop from which a given crystal was harvested. The soak solutions further contained 20% DMSO and the hydrazine and isatin species that were to be reacted. Hydrazine and isatin concentrations in the ranges (5–20) mM (total organic) and (0.5–2) mM (total organic) respectively were used. Total organic refers to the fact that in a soak solution containing multiple isatins and hydrazines the combined concentration of all the hydrazines would be (5–20) mM and the combined concentration of all the isatins would be (0.5–2) mM. All hydrazines were present in equimolar concentrations and all isatins were present in equimolar concentrations. That is, if a soak solution contained two isatins and four hydrazines with total organic hydrazine and isatin concentrations of 10 mM and 1 mM respectively, each of the isatins would be present at a concentration of 0.5 mM and each of the hydrazines at a concentration of 2.5 mM. Soak solutions typically contained permutations of (1–5) isatins and/or (1–6) hydrazines (see Table 1). The generic formula for the products formed by permuting the reactants is given in Scheme 2 (see also Table 2). The exact nature of the products formed is detailed in Table 2. Crystals were soaked for 3–5 days at 18° C.
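The ‘total organic’ convention described above amounts to dividing the total concentration of each monomer class equally among its members. The following minimal sketch reproduces the worked figures given in the text; the function name is an assumption made for illustration.

```python
def per_component_conc(total_organic_mM, n_components):
    """Concentration (mM) of each equimolar component in a soak solution."""
    if n_components < 1:
        raise ValueError("at least one component is required")
    return total_organic_mM / n_components


# Worked example from the text: two isatins at 1 mM total organic and
# four hydrazines at 10 mM total organic
isatin_each = per_component_conc(1.0, 2)
hydrazine_each = per_component_conc(10.0, 4)
```

Here `isatin_each` evaluates to 0.5 mM and `hydrazine_each` to 2.5 mM, in agreement with the figures stated above.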

Crystals were frozen by momentarily dipping them into a cryo-protectant solution and then snap cooling them in liquid nitrogen. The cryo-protectant solutions contained (17.5–22.5)% glycerol and ammonium acetate, PEG and HEPES/NaOH pH 7.4 concentrations that were identical to the drops from which the crystals were originally harvested.

Crystals grown using (2) were soaked and frozen identically to those grown using (1), except that rather than maintaining ammonium acetate, PEG and HEPES/NaOH pH 7.4 concentrations only the HEPES/NaOH pH 7.4 concentration was maintained.

X-ray diffraction data were collected from soaked crystals, cooled to 100 K, on a Rigaku copper rotating anode source using Rigaku/MSC Jupiter CCD, or Raxis IV++ image plate, detectors. The data were integrated, reduced and scaled using either the D*TREK suite (Pflugrath, J. W., Acta Crystallographica, D55, 1718–1725 (1999)), or MOSFLM (Leslie, A. G. W. In Joint CCP4 and EESF-EACMB Newsletter on Protein Crystallography, vol. 26, Warrington, Daresbury Laboratory (1992)), SCALA and the CCP4 suite of programs (CCP4: see above). An apo-cdk2 structure (DeBondt, H. L., et al., Nature, 363, 595–602 (1993)) was used as a starting point for structure refinements. The structures were initially segmented into 25 amino acid sections and rigid-body refined in CNX (see above). The structures were then subjected to iterative cycles of positional and isotropic B-factor refinement in CNX, followed by manual rebuilding using the graphics program “O” (see above) and automated ligand fitting using AUTOSOLVE®. Final water fitting was performed using in-house software. Details of representative X-ray data are given in Table 3.

TABLE 3 Representative X-ray results and experimental conditions

| Components in soaking solution | A5 + B2 | A5 + B3 | A5 + B5 | A5 + B1 | B2 + (A1, A2, A3, A4) | B2 + (A1, A2, A3, A4, A5, A6) | (B1, B2, B3, B4, B5) + (A1, A2, A3, A4, A5, A6) |
|---|---|---|---|---|---|---|---|
| Space group | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ |
| a (Å) | 53.68 | 53.69 | 53.50 | 53.73 | 53.53 | 53.62 | 53.69 |
| b (Å) | 71.94 | 71.94 | 71.41 | 72.11 | 71.48 | 71.83 | 72.41 |
| c (Å) | 71.90 | 72.25 | 72.12 | 72.36 | 72.22 | 71.89 | 72.10 |
| Maximal resolution (Å) | 2.20 | 2.25 | 2.65 | 2.70 | 2.20 | 2.20 | 2.80 |
| Observations | 36740 | 41480 | 22249 | 23195 | 36709 | 35997 | 22680 |
| Unique reflections | 14265 | 13583 | 8277 | 8470 | 14171 | 14250 | 6991 |
| Completeness (%) | 97.2 | 98.5 | 98.2 | 98.6 | 92.7 | 89.1 | 96.9 |
| R_merge | 0.088 | 0.153 | 0.122 | 0.143 | 0.057 | 0.035 | 0.127 |
| Mean I/σI | 4.3 | 4.1 | 3.8 | 3.7 | 6.2 | 8.6 | 4.8 |
| Highest resolution bin (Å) | 2.28–2.2 | 2.33–2.25 | 2.74–2.65 | 2.74–2.65 | 2.28–2.2 | 2.28–2.2 | 2.95–2.8 |
| Completeness (%), highest bin | 88.5 | 97.8 | 99.5 | 100 | 86.8 | 85.2 | 98.0 |
| R_merge¹, highest bin | 0.28 | 0.284 | 0.332 | 0.344 | 0.187 | 0.134 | 0.258 |
| Mean I/σI, highest bin | 2.0 | 2.4 | 2.0 | 2.1 | 2.3 | 3.0 | 2.8 |
| Refinement | | | | | | | |
| Protein atoms | 2279 | 2279 | 2279 | 2279 | 2279 | 2279 | 2279 |
| Other atoms: inhibitor | 23 | 26 | 27 | 25 | 0 | 23 | 23 |
| Other atoms: water | 139 | 129 | 57 | 39 | 245 | 229 | 67 |
| Resolution range (Å) | 35–2.2 | 35–2.25 | 35–2.65 | 35–2.7 | 35–2.2 | 35–2.2 | 35–2.8 |
| R_conv (%)² | 25.9 | 24.7 | 23.7 | 24.3 | 25.6 | 24.6 | 21.1 |
| R_free (%)³ | 30.6 | 28.9 | 29.0 | 28.1 | 27.2 | 25.5 | 24.9 |
| Mean B-factor (Å²) | | | | | | | |
| Protein | 37.0 | 38.1 | 42.2 | 49.0 | 26.1 | 30.8 | 30.6 |
| Ligand | 31.2 | 37.3 | 47.0 | 59.7 | NA | 24.3 | 25.5 |
| Solvent | 47.2 | 42.8 | 35.5 | 50.4 | 34.2 | 41.5 | 34.8 |
| Soaks | | | | | | | |
| Isatin conc (mM) | 2⁶ | 0.5⁶ | 0.5⁶ | 1⁶ | 1 | 1 | 1 (total organic⁴) |
| Hydrazine conc (mM) | 20 | 5 | 5 | 10 | 10 (total organic⁴) | 10 (total organic⁴) | 10 (total organic⁴) |
| Soak time (days) | 5 | 3 | 3 | 3 | 3 | 3 | 3 |
| Product present | YES⁵ | YES⁵ | YES⁵ | YES⁵ | NO⁵ | YES⁵ | YES⁵ |

¹ $R_{\mathrm{merge}} = \sum_h \sum_j |I_{h,j} - \bar{I}_h| \,/\, \sum_h \sum_j |I_{h,j}|$, where $I_{h,j}$ is the $j$th observation of reflection $h$.

² $R_{\mathrm{conv}} = \sum_h \big|\,|F_o| - |F_c|\,\big| \,/\, \sum_h |F_o|$, where $F_o$ and $F_c$ are the observed and calculated structure factor amplitudes respectively for the reflection $h$.

³ R_free is equivalent to R_conv, but for a 5% subset of the reflections not used for refinement.

⁴ Total organic refers to the total combined isatin, or hydrazine, concentration in the soak, each component being present at a concentration equal to the total organic concentration divided by the number of isatin, or hydrazine, components in the mix.

⁵ Control experiments with crystals soaked for (3–5) days in cocktails containing either B2 (20 mM), (B1, B2, B3, B4, B5) (1 mM total organic⁴), or (A1, A2, A3, A4, A5, A6) (1 mM total organic) did not reveal any interpretable ligand density. Crystals soaked in a 10 mM cocktail of (A1, A2, A3, A4, A5, A6) were repeatedly destroyed.

⁶ In certain instances it was found that crystals subjected to soaks that contained only one hydrazine and one isatin sustained increased damage as compared to those soaked in multi-component isatin-hydrazine cocktails. Isatin and hydrazine concentrations were iteratively reduced until soaking crystals could be stabilised, and crystal damage diminished. Precipitation, attributed to isatin-hydrazine product formation, was frequently observed during soaks.
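As a hedged illustration of the residuals defined in footnotes 1 and 2 of Table 3, the following Python sketch evaluates both expressions on small made-up inputs. The function names `r_merge` and `r_factor` and the toy intensities and amplitudes are assumptions made for the example only; they are not taken from the data-processing programs cited above, whose implementations may differ in detail.

```python
from collections import defaultdict


def r_merge(observations):
    """R_merge as in footnote 1.

    observations: iterable of (hkl, intensity) pairs, where repeated
    hkl keys are repeat measurements of the same reflection.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)
    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R_conv as in footnote 2; applied to a held-out subset of
    reflections, the same expression gives R_free (footnote 3)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy data (hypothetical): two reflections, each measured twice.
toy_obs = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
           ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
print(r_merge(toy_obs))
print(r_factor([100.0, 50.0], [90.0, 55.0]))
```

Both functions return fractions; Table 3 reports R_merge as a fraction and R_conv and R_free as percentages.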

DISCUSSION OF RESULTS

Initial experiments were performed using reaction cocktails that contained a single isatin and a single hydrazine. In these instances, product binding was observed. Previous studies (Bramson, H. N., et al., J. Med. Chem., 44, 4339–4358 (2001)) suggested that inhibitors derived from A5 (see FIG. 1 & Table 2) should possess a relatively high degree of potency due to the presence of the sulphonamide group at the R³ position in the ligand. A reaction cocktail containing B2 and (A1–A4) did not reveal ligand binding, suggesting that the chlorine substitutions at positions (R¹–R³) did not confer significant potency upon the product ligands. A cocktail that contained B2+(A1–A6) did, however, reveal ligand binding, the electron density being consistent with the product formed by the reaction of (B2+A5). Thus the (B2+A5) reaction product was preferentially selected from a set of six possible products, lending support to the notion that the R³-sulphonamide group was conferring binding affinity upon the A5B2 ligand. The degeneracy of the product library was increased to 30 possible ligands using a reaction cocktail that contained isatins (B1–B5) and hydrazines (A1–A6). Each isatin was present at a concentration of 0.2 mM and each hydrazine at a concentration of 1.67 mM (see Table 3). Difference electron density consistent with the reaction product from (B2+A5) was again observed. These studies suggest that the protein preferentially selects the A5B2 ligand from the library of available product ligands. The potency of this compound was confirmed by synthesis of the A5B2 ligand. It is known from the literature that compound A5B2 has an IC50 of 43 nM (Bramson, et al., 2001).
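The size of the product library and the per-component concentrations quoted above can be checked with a short Python sketch. This is an illustration only: the labels follow the A/B naming used in the text, the assumption that every hydrazine pairs with every isatin to give one product is taken from the "30 possible ligands" statement, and the total organic concentrations (1 mM isatin, 10 mM hydrazine) are those given in Table 3.

```python
from itertools import product

# Reactant labels as used in the text: isatins B1-B5, hydrazines A1-A6.
isatins = [f"B{i}" for i in range(1, 6)]
hydrazines = [f"A{i}" for i in range(1, 7)]

# Assumption: each hydrazine/isatin pair yields one candidate product.
library = [f"{a}{b}" for a, b in product(hydrazines, isatins)]

# Equimolar split of the total organic concentrations from Table 3.
isatin_each = 1.0 / len(isatins)                   # mM per isatin
hydrazine_each = round(10.0 / len(hydrazines), 2)  # mM per hydrazine

print(len(library), "A5B2" in library)
print(isatin_each, hydrazine_each)
```

The enumeration gives 30 candidate products, of which A5B2 is the one whose electron density was observed.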

1. A method for identifying a ligand of a target macromolecule comprising the steps of: a) soaking one or more crystals of the target macromolecule in a solution containing a collection of compounds generated in situ or separate from the crystal, where the solution has been prepared without the purification of the synthesized collection of compounds; b) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and c) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.
2. A method according to claim 1, wherein the target macromolecule is selected from the group consisting of: proteins, ribose nucleic acids, deoxy ribose nucleic acid, and complexes of combinations of these.
3. A method according to claim 2, wherein the target macromolecule is a protein.
4. A method according to claim 1, wherein the collection of compounds are synthesised individually and then mixed together.
5. A method according to claim 1, wherein the collection of compounds are synthesised as a mixture by combinatorial chemistry.
6. A method according to claim 1, wherein the members of the collection of compounds are present at a concentration of at least 10 times their K_i.
7. A method according to claim 1, wherein the amount of each compound being a member of the collection of compounds, present in the solution will be present at a concentration which is at least 10 times as much as the concentration of the target macromolecule in the reaction system.
8. A method according to claim 1, wherein the members of the collection of compounds do not bind covalently to the target macromolecule.
9. A method for identifying a ligand of a target macromolecule comprising the steps of: a) synthesizing a collection of compounds, which are suitable for screening against a target macromolecule, in a solution containing one or more crystals of the target macromolecule; b) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and c) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.
10. A method for identifying a ligand of a target macromolecule comprising the steps of: a) synthesizing a collection of unpurified compounds, which are suitable for screening against a target macromolecule; b) adding the collection of compounds to a solution containing one or more crystals of the target macromolecule; c) obtaining an X-ray crystal diffraction pattern of the soaked macromolecule crystal; and d) using said X-ray crystal diffraction pattern to identify any compound bound to the macromolecule crystal, said compound being a ligand of the target macromolecule.
11. A method according to claim 10, wherein if step a) takes place in a solvent which is not compatible with the macromolecule crystals, then the method comprises the further step after step a) of separating the collection of compounds from the solvent in which the compounds were synthesised.
12. A method according to claim 10, wherein if step a) takes place in a solvent which is not compatible with the macromolecule crystals, the solvent in which step a) takes place is separated from the solution containing the one or more macromolecule crystals by a permeable membrane.